A Shortest-Paths Heuristic for Statistical Data Protection in Positive Tables

Author

  • Jordi Castro
Abstract

National Statistical Agencies (NSAs) routinely release large amounts of tabular information. Prior to dissemination, tabular data need to be processed to avoid the disclosure of individual confidential information. Cell suppression is one of the techniques most widely used by NSAs. Optimal procedures for cell suppression are computationally expensive for large real-world data, so heuristic procedures are used in practice. Most heuristics for positive tables (i.e., cell values are non-negative) rely on the solution of minimum cost network flow subproblems. A very efficient heuristic based on shortest paths was developed in the past, but it was only appropriate for general tables (i.e., cell values can be either positive or negative), whereas in practice most tables are positive. The method presented in this work combines and improves previous approaches, overcoming some of their drawbacks: it is designed for positive tables and only requires the solution of shortest path subproblems, and is therefore much more efficient than other network flow heuristics. We report extensive computational experience with randomly generated and real-world instances, comparing the heuristic with alternative procedures. The results show that the method, currently included in a software package for statistical data protection, fits NSAs' needs: it is extremely efficient and provides good solutions.
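To make the shortest-path idea concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes a plain two-dimensional table in which every cell is an edge between a row node and a column node, and it protects one sensitive cell by finding the cheapest cycle through it, where the cost of an edge is the value of the cell that would have to be suppressed (zero if the cell is already suppressed). The function name `protect_cell`, the cost model, and the example table are all hypothetical. The sketch ignores protection levels and the restrictions that non-negative cell values impose on which cycles are valid, which is the difficulty the paper addresses.

```python
import heapq


def protect_cell(values, suppressed, target):
    """Return the cells on a cheapest cycle through `target`, or None.

    Illustrative assumption: rows and columns are graph nodes, each cell
    (i, j) is an undirected edge between row i and column j, and using a
    cell costs its value unless it is already suppressed.
    """
    n_rows, n_cols = len(values), len(values[0])
    ti, tj = target

    def cost(i, j):
        return 0 if (i, j) in suppressed else values[i][j]

    # A cycle through the target is a path from its column node back to
    # its row node that does not use the target cell itself.
    start, goal = ("c", tj), ("r", ti)
    dist = {start: 0}
    prev = {}
    heap = [(0, start)]
    while heap:  # Dijkstra's algorithm; all costs are non-negative
        d, node = heapq.heappop(heap)
        if node == goal:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        kind, k = node
        if kind == "r":
            neighbours = [(("c", j), (k, j)) for j in range(n_cols)]
        else:
            neighbours = [(("r", i), (i, k)) for i in range(n_rows)]
        for nxt, cell in neighbours:
            if cell == target:
                continue
            nd = d + cost(*cell)
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = (node, cell)
                heapq.heappush(heap, (nd, nxt))

    if goal not in dist:
        return None
    # Walk back from the goal, collecting the cells used along the path.
    cycle = [target]
    node = goal
    while node != start:
        node, cell = prev[node]
        cycle.append(cell)
    return cycle


table = [[20, 50, 10],
         [8, 19, 22],
         [17, 32, 12]]
print(protect_cell(table, {(1, 1)}, (1, 1)))
# → [(1, 1), (1, 0), (2, 0), (2, 1)]
```

In this example the cells (1, 0), (2, 0) and (2, 1) would become complementary suppressions at a total cost of 8 + 17 + 32 = 57. Cells that are already suppressed cost nothing, so later sensitive cells tend to reuse earlier suppressions.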


Similar resources

Internet Traffic Engineering by Optimizing OSPF Weights

Open Shortest Path First (OSPF) is the most commonly used intra-domain internet routing protocol. Traffic flow is routed along shortest paths, splitting flow at nodes where several outgoing links are on shortest paths to the destination. The weights of the links, and thereby the shortest path routes, can be changed by the network operator. The weights could be set proportional to their physical...


A Polynomial Algorithm for Optimal Univariate Microaggregation

Microaggregation is a technique used by statistical agencies to limit disclosure of sensitive microdata. Noting that no polynomial algorithms are known to microaggregate optimally, Domingo-Ferrer and Mateo-Sanz have presented heuristic microaggregation methods. This paper is the first to present an efficient polynomial algorithm for optimal univariate microaggregation. Optimal partitions are sh...


Shared Protection of Virtual Private Networks with Heuristic Methods

In our paper we analysed algorithms that seek the optimal routing configuration in backbone networks with capacity constraints. We investigated a special case when multiple full mesh demand sets (forming Virtual Private Networks (VPNs)) have to be built. We examined the pro-active path based shared protection scheme and investigated heuristic algorithms to calculate the paths. We analysed two d...


Segment shared protection for survivable meshed WDM optical networks

In this paper, we investigate the protection design for survivable meshed WDM optical networks, and propose a novel heuristic algorithm, called segment shared protection (SSP), to protect completely against dual-link failures. For each connection request, SSP first computes a least-cost working path, then divides the working path into several non-overlapping segment paths according to ...


A Polynomial Algorithm for Optimal Microaggregation

Microaggregation is a technique that is used by statistical agencies to limit disclosure of sensitive microdata. Noting that no polynomial time algorithms are known to microaggregate optimally, Domingo-Ferrer and Mateo-Sanz have presented heuristic methods based on hierarchical clustering and genetic algorithms to identify sub-optimal solutions. We present an efficient polynomial time algorit...



Journal title:
  • INFORMS Journal on Computing

Volume 19  Issue 

Pages  -

Publication date 2007